---
title: "Bayesian Semi-NMF Code Demonstration"
output:
  html_document:
    df_print: paged
---

Firstly, put the script into working directory, and source the script 

```{r}
source("Bayesian.PMF.R")
```

We create a synthetic dataset to demonstrate our factorization algorithm implementation.

```{r}
library('MCMCpack')   # to sample from Dirichlet distribution

# A
K.star       <- 5                  # number of asset classes
alpha.W.star <- seq(1, K.star)   # concentration parameter for the Dirichlet distribution

# dimensions of observed X
n            <- 50 ## number of banks
T            <- 30 ## number of days

# B: mean and sd of Gaussian 
mu.V.star     <- 0
sigma.V.star  <- 1

# NOISE - eps
k.eps.star      <- 1     # shape for inv gamma prior
theta.eps.star  <- 1   # rate


# Generate the true data
sd.eps.star <- sqrt(rinvgamma(1, k.eps.star, theta.eps.star))   
W.star      <- rdirichlet(n, alpha.W.star)
V.star      <- matrix(rnorm(K.star*T, mean = mu.V.star, sd = sigma.V.star), nrow = K.star, byrow = TRUE)
E.star      <- matrix(rnorm(n*T, mean = 0, sd = sd.eps.star), nrow = n, byrow = TRUE)
Z           <- W.star %*% V.star + E.star  
```

Let's test if our algorithm can factorize matrix Z into factors W and V: 

```{r}
# INITIAL VALUES
K       <- K.star
W.start <- matrix(rep(1, n*K), nrow = n, byrow = TRUE) / K    # should use distributed values better for proposal -- this is just for illustration
V.start <- matrix(runif(K*T, -2, 2), nrow = K, byrow = TRUE)  

results <- run.mcmc.matrix.factorization (no.iter = 40000, K = K.star, proposal.step = 0.1, 
                               X = Z, W.start = W.start, alpha.W = rep(1, K), 
                               V.start = V.start, mu.V = 0.1, sigma.V = 0.75, sd.eps.start = 2) 
```

Once we finished estimation, we should check for `acceptance rate`:

```{r}
results$accept
```

Here we are getting a very high acceptance rate; we would of course tune the `proposal.step` size in practice.

Next, we test for convergence of samples. Here, we use Geweke test to test for convergence of `eps` and `V`. Take burnin to be 10000 samples

```{r}
N      <- 40000
burnin <- 10000
z.eps <- geweke.test(results$eps[burnin:N], intervals = 50, len = 200)
plot(z.eps)
```

The plot indicates that the test statistic `z.eps` is within -2 and 2, so the chain for `eps` is likely to have converged. 
Similarly, to test for V and W, we only take a portion of samples, say a few thousand to avoid memory problem.

```{r}
v.samples <- unlist(results$V[39900:N])
length(v.samples)   # 15150
z.v <- geweke.test(v.samples, intervals = 50, len = 200)
plot(z.v)

w.samples <- unlist(results$W[39900:N])
length(w.samples)   # 15150
z.w <- geweke.test(w.samples, intervals = 50, len = 200)
plot(z.w)
```

We can see that the chains for V and W are likely to have converged. We can further check by plotting the distributions of V and W.

```{r}
par(mfrow=c(2,1))
hist(v.samples, breaks = 20)
hist(w.samples, breaks = 20)
```

To compare with original data-generating distribution, 

```{r}
eps.samples <- results$eps[burnin:N]
hist(eps.samples, breaks = 20)

mean(eps.samples)
sd.eps.star

## W
mean(w.samples)
sd(w.samples)

mean(W.star)
sd(W.star)

## V
mean(v.samples)
sd(v.samples)

mu.V.star
sigma.V.star
```

We can see that our learnt distribution matched quite well in terms of mean and sd. 

Now, we want to check for the quality of our factorization using the posteriors learnt in MCMC. One way to do this is to use means of posteriors for W and V and check their product against Z:

```{r}
W.m <- results$W[[burnin]]
start <- burnin + 1
for (j in start:N) {
  W.m <- W.m + results$W[[j]]
}
W.m <- W.m / (N - start)

V.m <- results$V[[burnin]]
for (j in start:N) {
  V.m <- V.m + results$V[[j]]
}
V.m <- V.m / (N - start)

print("Frobenius norm for the difference")
sqrt(sum((Z - W.start %*% V.start)^2))     # starting point 
sqrt(sum((Z - W.m %*% V.m)^2))             # posterior means
sqrt(sum((Z - W.star %*% V.star)^2 ))      # true data 

```



